Data Partitioning Strategy of GPU Heterogeneous Clusters Based on Learning

Authors

  • Jianjiang Li
  • Wei Chen
  • Hongyan Zheng
  • Peng Zhang
  • Yajun Liu
Abstract

With the rapid progress of computational science and computer simulation, many properties can now be predicted through parallel computation before actual research and development begins. As high-performance computer architectures have developed, the GPU has become increasingly widely used in high-performance computing, and a growing number of computations run on heterogeneous GPU clusters. However, partitioning the workload and mapping it onto computing resources remains a central and difficult problem. Because application programmers often do not know the computing power of each node or the hardware architecture of the cluster, some partitioning strategies cause serious load imbalance. To address the complexity introduced by the differing computing abilities of the nodes in a GPU cluster, this paper proposes a learning-based data partitioning strategy for heterogeneous GPU clusters. The strategy collects the state of each node while a program runs and dynamically estimates each node's computing ability, which then guides data partitioning. Test results show that the strategy allocates work to nodes according to their computing ability and thereby balances load across nodes, improving the execution performance of CUDA programs on heterogeneous GPU clusters and laying a foundation for efficient computing on such clusters.
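
The abstract does not state how node ability is estimated or how shares are computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that each node's ability is summarized as a throughput (items per second) measured on earlier runs, that estimates are smoothed with an exponential moving average, and that data is split in proportion to those estimates. The names `update_throughput`, `partition`, and the smoothing weight `alpha` are hypothetical.

```python
from typing import List


def update_throughput(prev: float, items: int, seconds: float,
                      alpha: float = 0.5) -> float:
    """Blend a new throughput observation into the running estimate.

    Assumption: ability is modeled as items processed per second, and
    `alpha` weights the newest observation (hypothetical parameter).
    """
    observed = items / seconds
    if prev <= 0:
        # No earlier estimate for this node: take the observation as-is.
        return observed
    return alpha * observed + (1 - alpha) * prev


def partition(total_items: int, throughputs: List[float]) -> List[int]:
    """Split total_items across nodes in proportion to their throughput."""
    total = sum(throughputs)
    # Round each proportional share down first.
    shares = [int(total_items * t / total) for t in throughputs]
    # Hand out the items lost to rounding, fastest nodes first.
    leftover = total_items - sum(shares)
    order = sorted(range(len(throughputs)),
                   key=lambda i: throughputs[i], reverse=True)
    for k in range(leftover):
        shares[order[k % len(order)]] += 1
    return shares


if __name__ == "__main__":
    # Two nodes, no prior estimates; the second ran the same work 3x faster.
    estimates = [update_throughput(0.0, 300, 3.0),
                 update_throughput(0.0, 300, 1.0)]
    print(partition(1000, estimates))
```

In this sketch a node that processed the same batch three times faster receives three times as much data in the next round. A real cluster would also need to account for transfer costs and GPU memory limits, which the sketch ignores.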

Similar resources

A Self-adaptive Workload Balancing Algorithm on GPU Clusters

With the wide application of GPUs in high-performance computing, more and more heterogeneous CPU+GPU clusters have been established in many fields. But with the widespread use of heterogeneous CPU+GPU clusters, workload balancing has become an important problem when the process nodes coordinate with each other, and the execution time of a program on imbalanced clusters depends on the slowes...

OpenCL Task Partitioning in the Presence of GPU Contention

Heterogeneous multi- and many-core systems are increasingly prevalent in the desktop and mobile domains. On these systems it is common for programs to compete with co-running programs for resources. While multi-task scheduling for CPUs is a well-studied area, how to partition and map computing tasks onto the heterogeneous system in the presence of GPU contention (i.e. multiple programs compete ...

Separating Well Log Data to Train Support Vector Machines for Lithology Prediction in a Heterogeneous Carbonate Reservoir

The prediction of lithology is necessary in all areas of petroleum engineering. This means that to design a project in any branch of petroleum engineering, the lithology must be well known. Support vector machines (SVMs) use an analytical approach to classification based on statistical learning theory, the principles of structural risk minimization, and empirical risk minimization. In this res...

High-Dimensional Unsupervised Active Learning Method

In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...

Accelerating high-order WENO schemes using two heterogeneous GPUs

A double-GPU code is developed to accelerate WENO schemes. The test problem is a compressible viscous flow. The convective terms are discretized using third- to ninth-order WENO schemes and the viscous terms are discretized by the standard fourth-order central scheme. The code written in CUDA programming language is developed by modifying a single-GPU code. The OpenMP library is used for parall...

Publication date: 2016